Introduction

In this project, we dive into the world of instant noodles, aka ramen. Ramen noodles were invented by a Japanese businessman named Momofuku Ando in 195 (Wikipedia contributors 2021b). He created the first precooked instant noodles to increase the production and the distribution of noodles and to help fight hunger in post-war Japan. Since then, because of its convenience, ramen noodles have become popular all over the world and have been adapted to various cuisines. Its cheap price also helps win popularity, especially among students like us.

We came across the data set on the Ramen Rater’s website (Lienesch 2021). It was created by a single ramen enthusiast, with over 3950 reviews on all kinds of instant noodles one can possibly find in stores (Lienesch, n.d.).

As ramen buyers, we are interested in figuring out what features are important in finding a good ramen and how we can use these features to predict whether a pack of ramen is good or bad. We believe that our analysis can provide insights to ramen lovers and general consumers during their purchases in stores and help ramen companies decide their marketing strategies to increase sales.

The Dataset

The original data set contains 3950 rows and 7 columns. Each observation in the data set is a review for a single ramen product. The features include a review number, where bigger number represents a more recent review, the brand, the product’s name, its manufacturing country, packaging style (such as cup or bowl), a star rating, which range from 0 to 5 inclusively with 0.25 increments, and a boolean value indicating if the product is in the world’s top ten ramen list. Note that the stars represent the reviewer’s personal preference and are not based on sales or popularity.

Our prediction is the star rating. We decide to convert the rating to a binary variable using a threshold of 3.5, with 0 (Star < 3.5) being bad ramen and 1 (Star >= 3.5) being good ramen. This threshold is set by the original reviewer himself.

Exploratory Data Analysis

To understand the data better, we explore to visualize the distribution of the country of origins of all products. It seems that most products come from China, South Korea, Japan, and the USA.

Figure 1. Origins of Ramen Products

Figure 1. Origins of Ramen Products

There are many variety and the below word cloud displays the most common keywords in ramen descriptions. Wow, these noodles are created with so many flavors! They also come in with different packaging. A half of the sample come in as a pack. But some are sold in a bowl or tray, which are more convenient for direct usage.

Figure 2. Word Cloud of Ramen Variety and Package Style HistogramFigure 2. Word Cloud of Ramen Variety and Package Style Histogram

Figure 2. Word Cloud of Ramen Variety and Package Style Histogram

Let’s see how the ratings distribute. It looks like most ramens are quite tasty! But there are a few that received a zero star.

Figure 3. Histogram of Ratings

Figure 3. Histogram of Ratings

Methods

For the preprocessing, we apply One Hot Encoding to transform brand, country, and style and use bag-of-word to process variety feature. We drop top ten because its values are very sparse and review # which only acts as an identifier. Note that our target now is a binary variable indicating whether the product is good or bad. Our question is about classification.

For model selection, we tried 4 models: CatBoost, Logistic Regression, Random Forest, and SVM. For feature selection, we used one wrapper algorithms (boruta algorithm) and recursive feature elimination. We finally chose CatBoost and Boruta Algorithm selected features as our final model’s setup. The final model has the high valid accuracy (0.758) and a small accuracy gap (0.046) between test data set and train data set with only 77 features, as shown below.

Figure 4. Test accuracy and Train/Test accuracy gap of different combinations

Figure 4. Test accuracy and Train/Test accuracy gap of different combinations

Five-fold cross validation and random search are used to optimize the model. After tuning hyperparameters, we use {‘learning_rate’: 0.078, ‘max_depth’: 5, ‘n_estimators’: 600} as our final parameters. Since there was class imbalance in the target (0.7 vs. 0.3), we trained the model with class_weights="balanced".

Results

Our final CatBoost model gives a precision score of 0.760, a recall score of 0.954, and a F1 score of 0.847 on the test data set. These scores are quite high and demonstrate good classification performance.

As below, these two tables shows us the valid and test’s performance.

Table 1. Validation and Train Performance
valid_accuracy train_accuracy valid_f1 train_f1 valid_recall train_recall valid_precision train_precision
0.7626582 0.8378165 0.8590226 0.9017729 0.9682203 0.9968220 0.7719595 0.8232721
0.7420886 0.8401899 0.8440191 0.9027911 0.9343220 0.9936441 0.7696335 0.8271605
0.7436709 0.8354430 0.8465909 0.9005736 0.9470339 0.9978814 0.7654110 0.8205575
0.7626582 0.8405854 0.8587571 0.9032877 0.9661017 0.9968220 0.7728814 0.8258008
0.7784810 0.8279272 0.8679245 0.8961566 0.9745763 0.9941737 0.7823129 0.8157323
Table 2. Test and Train Performance
test_accuracy train_accuracy test_f1 train_f1 test_recall train_recall test_precision train_precision
0.7468354 0.8272152 0.8470948 0.8959207 0.9719298 0.9957627 0.7506775 0.8142758

As below, the plot shows us the confusion matrix of the CatBoost.

Interpretation

Shapley values are used here to explain the CatBoost as below. We can see that there are more good ramens associated features than those of bad ramens. It makes sense because the dataset tends to score positively to ramens. Good ramen noodles are usually associated with features like being brand Samyang Foods or Nissin, having description keyword “goreng” (which refers to fried food in Southeast Asian cuisine (Wikipedia contributors 2021a)), and are made in Japan or Indonesia. On the other hand, bad ramen noodles are associated with features like being cup noodle or made in United States. Now whenever you are craving for quick, simple, and tasty ramen noodles, remember to come back for this plot!

Critique

First of all, the amount of data used to build the model is relatively small, which may have a certain impact on the model performance. Secondly, the feature Top ten was not used in the analysis process. In the future, we hope to make reasonable use of this indicator after learning more data processing methods. Lastly, we recognize that the data set contains reviews done by a single person, which makes our prediction model very subjective and not generalizable for the general audience. One shall proceed with caution when using this result as a shopping guide.

References

Lienesch, Hans "The Ramen Rater". 2021. “The Ramen Rater.” THE RAMEN RATER. https://www.theramenrater.com/.

Wikipedia contributors. 2021a. “Goreng — Wikipedia, the Free Encyclopedia.” https://en.wikipedia.org/w/index.php?title=Goreng&oldid=1012204146.

———. 2021b. “Momofuku Ando — Wikipedia, the Free Encyclopedia.” https://en.wikipedia.org/w/index.php?title=Momofuku_Ando&oldid=1059586733.